DSA: Decentralized Double Stochastic Averaging Gradient Algorithm
Authors: Aryan Mokhtari, Alejandro Ribeiro
Abstract
This paper considers convex optimization problems in which the nodes of a network have access to summands of a global objective. Each of these local objectives is further assumed to be an average of a finite set of functions. The motivation for this setup is to solve large-scale machine learning problems where elements of the training set are distributed across multiple computational elements. The decentralized double stochastic averaging gradient (DSA) algorithm is proposed as a solution that relies on: (i) the use of local stochastic averaging gradients; (ii) determination of descent steps as differences of consecutive stochastic averaging gradients. Strong convexity of the local functions and Lipschitz continuity of the local gradients are shown to guarantee linear convergence in expectation of the sequence generated by DSA. Local iterates are further shown to approach the optimal argument for almost all realizations. The expected linear convergence of DSA is in contrast to the sublinear rate characteristic of existing methods for decentralized stochastic optimization. Numerical experiments on a logistic regression problem illustrate reductions in convergence time and in the number of feature vectors processed until convergence relative to these alternatives.
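To make the two ingredients above concrete, here is a minimal per-node sketch in Python of an update of this form: SAGA-style local stochastic averaging gradients combined with an EXTRA-style consensus recursion whose descent direction is the difference of consecutive averaging gradients. The function name dsa_style_update, the mixing-matrix choice W_tilde = (I + W)/2, and the step size alpha are illustrative assumptions rather than the paper's exact specification.

import numpy as np

def dsa_style_update(grad_fns, W, x0, alpha=1e-2, num_iters=1000, seed=0):
    """Illustrative DSA-style iteration (a sketch, not the paper's exact method).

    grad_fns[i] is a list of gradient functions, one per local sample at node i.
    W is an n-by-n doubly stochastic mixing matrix; x0 is an n-by-d array of
    initial local iterates.
    """
    rng = np.random.default_rng(seed)
    n, d = x0.shape
    W_tilde = (np.eye(n) + W) / 2.0   # assumed EXTRA-style second mixing matrix

    # SAGA gradient tables: the last gradient evaluated for each local sample.
    tables = [np.array([g(x0[i]) for g in grad_fns[i]]) for i in range(n)]
    g_prev = np.array([tables[i].mean(axis=0) for i in range(n)])

    x_prev = x0.copy()
    x = W @ x_prev - alpha * g_prev   # first step: consensus plus averaging gradient

    for _ in range(num_iters):
        g = np.zeros((n, d))
        for i in range(n):
            idx = rng.integers(len(grad_fns[i]))       # draw one local sample
            new_grad = grad_fns[i][idx](x[i])
            # Local stochastic averaging (SAGA-style) gradient.
            g[i] = new_grad - tables[i][idx] + tables[i].mean(axis=0)
            tables[i][idx] = new_grad
        # Descent step built from the difference of consecutive stochastic
        # averaging gradients, plus an EXTRA-style consensus correction.
        x_next = x + W @ x - W_tilde @ x_prev - alpha * (g - g_prev)
        x_prev, x, g_prev = x, x_next, g
    return x

With a doubly stochastic mixing matrix of a connected network and a sufficiently small alpha, each row of the returned array would be expected to agree and approach the common minimizer, consistent with the linear convergence in expectation claimed in the abstract.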
Similar Resources
D$^2$: Decentralized Training over Decentralized Data
While training a machine learning model using multiple workers, each of which collects data from their own data sources, it would be most useful when the data collected from different workers can be unique and different. Ironically, recent analysis of decentralized parallel stochastic gradient descent (D-PSGD) relies on the assumption that the data hosted on different workers are not too differ...
Asynchronous Decentralized Parallel Stochastic Gradient Descent
Recent work shows that decentralized parallel stochastic gradient descent (D-PSGD) can outperform its centralized counterpart both theoretically and practically. While asynchronous parallelism is a powerful technology to improve the efficiency of parallelism in distributed machine learning platforms and has been widely used in many popular machine learning software and solvers based on centrali...
Averaging Asynchronously Using Double Linear Iterations
The distributed averaging problem is to devise a protocol which will enable the members of a group of n > 1 agents to asymptotically determine, in a decentralized manner, the average of the initial values of their scalar agreement variables. A typical averaging protocol can be modeled by a linear iterative equation whose update matrices are doubly stochastic. Building on the ideas proposed in [1...
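As a small illustration of such a protocol (an assumed Python example, not taken from the cited work), repeated multiplication by a doubly stochastic matrix W drives every agent's value toward the average of the initial values:

import numpy as np

# A doubly stochastic mixing matrix: rows and columns each sum to 1.
W = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])
x = np.array([1.0, 5.0, 9.0])   # initial agreement variables; their average is 5
for _ in range(50):
    x = W @ x                   # one round of the linear averaging iteration
print(x)                        # every entry is numerically 5.0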
Distributed Averaging using non-convex updates
Motivated by applications in distributed sensing, a significant amount of effort has been directed towards developing energy efficient algorithms for information exchange on graphs. The problem of distributed averaging has been studied intensively because it appears in several applications such as estimation on ad hoc wireless and sensor networks. A Gossip Algorithm is an averaging algorithm th...
Stochastic averaging for SDEs with Hopf Drift and polynomial diffusion coefficients
It is known that a stochastic differential equation (SDE) induces two probabilistic objects, namely a diffusion process and a stochastic flow. While the diffusion process is determined by the infinitesimal mean and variance given by the coefficients of the SDE, this is not the case for the stochastic flow induced by the SDE. In order to characterize the stochastic flow uniquely, the infinitesimal cov...
Journal: Journal of Machine Learning Research
Volume: 17
Pages: -
Published: 2016